Integrating sequence variation and protein structure to identify sites under selection.
Abstract
We present a novel method to identify sites under selection in protein-coding genes. Our method combines the traditional Goldman-Yang model of coding-sequence evolution with information obtained from the 3D structure of the evolving protein, specifically the relative solvent accessibility (RSA) of individual residues. We develop a random-effects likelihood sites model in which rate classes are RSA dependent. The RSA dependence is modeled with linear functions. We demonstrate that our RSA-dependent model provides a significantly better fit to molecular sequence data than does a traditional, RSA-independent model. We further show that our model provides a natural, RSA-dependent neutral baseline for the evolutionary rate ratio ω = dN/dS. Sites that deviate from this neutral baseline likely experience selection pressure for function. We apply our method to the influenza proteins hemagglutinin and neuraminidase. For hemagglutinin, our method recovers positively selected sites near the sialic acid-binding site and negatively selected sites that may be important for trimerization. For neuraminidase, our method recovers the oseltamivir resistance site and otherwise suggests that few sites deviate from the neutral baseline. Our method is broadly applicable to any protein sequences for which structural data are available or can be obtained via homology modeling or threading.
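To make the idea of an RSA-dependent neutral baseline concrete, the following is a minimal Python sketch, not the authors' random-effects likelihood model. It assumes RSA is normalized to [0, 1], that the expected ω is a single linear function of RSA, and that a site is flagged when its ω estimate departs from that expectation by more than a fixed tolerance. The names `LinearBaseline` and `classify_site`, the coefficients, and the tolerance are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearBaseline:
    """Hypothetical linear baseline: expected omega = intercept + slope * RSA."""

    intercept: float  # assumed expected omega for a fully buried residue (RSA = 0)
    slope: float      # assumed change in expected omega per unit of RSA

    def omega(self, rsa: float) -> float:
        # RSA is assumed to be normalized to the interval [0, 1].
        if not 0.0 <= rsa <= 1.0:
            raise ValueError("RSA is expected in [0, 1]")
        # A rate ratio cannot be negative, so clip the line at zero.
        return max(0.0, self.intercept + self.slope * rsa)


def classify_site(omega_site: float, rsa: float,
                  baseline: LinearBaseline, tolerance: float = 0.2) -> str:
    """Compare a per-site omega estimate with the RSA-dependent expectation.

    The fixed tolerance is an illustrative stand-in for a proper
    statistical test; it is not part of the published method.
    """
    expected = baseline.omega(rsa)
    if omega_site > expected + tolerance:
        return "above baseline"
    if omega_site < expected - tolerance:
        return "below baseline"
    return "near baseline"


if __name__ == "__main__":
    baseline = LinearBaseline(intercept=0.1, slope=0.5)  # made-up coefficients
    for omega_site, rsa in [(1.5, 0.5), (0.05, 0.8), (0.3, 0.5)]:
        print(omega_site, rsa, classify_site(omega_site, rsa, baseline))
```

The point of the sketch is only that the same ω value can be unremarkable at an exposed site and unusual at a buried one, because the expectation it is compared against changes with RSA.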
Similar resources
Mitochondrial DNA variation in wild and hatchery populations of northern pike, Esox lucius L.
Esox lucius is an economically important freshwater species. Mitochondrial cytb, 12SrRNA, and 16SrRNA gene sequences were used to clarify the genetic variation and population structure in three E. lucius populations, i.e., one wild population (W) and two hatchery populations (Hatchery Population I-HPI and Hatchery Population II-HPII). A total of 55 individuals, with 19 from wild and 1...
Variation in the Analysis of Positively Selected Sites Using Nonsynonymous/Synonymous Rate Ratios: An Example Using Influenza Virus
Sites in a gene showing the nonsynonymous/synonymous rate ratio (ω) >1 have been frequently identified to be under positive selection. To examine the performance of such analysis, sites of the ω ratio >1 in the HA1 gene of H3N2 subtype human influenza viruses were identified from seven overlapping sequence data sets in this study. Our results showed that the sites of the ω ratio >1 were of sign...
Within- and between-species DNA sequence variation and the 'footprint' of natural selection.
Extensive DNA data emerging from genome-sequencing projects have revitalized interest in the mechanisms of molecular evolution. Although the contribution of natural selection at the molecular level has been debated for over 30 years, the relevant data and appropriate statistical methods to address this issue have only begun to emerge. This paper will first present the predominant models of neut...
Novel Small Molecules against Two Binding Sites of Wnt2 Protein as potential Drug Candidates for Colorectal Cancer: A Structure Based Virtual Screening Approach
Wnts are the major ligands responsible for activating the Wnt signaling pathway through binding to Frizzled proteins (Fzd) as the receptors. Among these ligands, Wnt2 plays the main role in the tumorigenesis of several human cancers, especially colorectal cancer (CRC). Therefore, it can be considered as a potential drug target. The aim of this study was to identify potential drug candidates ...
Journal title:
- Molecular Biology and Evolution
Volume 30, Issue 1
Pages: -
Publication date: 2013